Art Unit: 2624

DETAILED ACTION 

Response to Arguments 

Applicant's election of claims 1-10 of Group I in the reply filed on 11/04/2008 is
acknowledged. Because applicant did not distinctly and specifically point out the 
supposed errors in the restriction requirement, the election has been treated as an 
election without traverse (MPEP 818.03(a)). Applicant has amended claims 23-29 to 
depend from independent claim 1 of Group I. Therefore, claims 1-10 and 23-29 are 
examined as set forth below.

Specification 

The title of the invention is not descriptive. A new title is required that is clearly 
indicative of the invention to which the claims are directed. 

Claim Objections 

Claims 1 and 5 are objected to because of the following informalities: Claim 1
includes the limitation "processing target frame", which is unclear. Does this
mean that the target frame performs processing? Claim 5 includes the limitation "is
angles made by", which is grammatically incorrect. Appropriate correction is required.

Claim Rejections - 35 USC § 102 

(b) the invention was patented or described in a printed publication in this or a foreign country or in public 
use or on sale in this country, more than one year prior to the date of application for patent in the United 
States. 

Claims 1-9 and 23-29 are rejected under 35 U.S.C. 102(b) as being anticipated
by J. Heinzmann and A. Zelinsky, "3-D facial pose and gaze point estimation using a
robust real-time tracking paradigm," IEEE Int. Workshop on Automatic Face and
Gesture Recognition, pp. 142-147, 1998 (Heinzmann).

As per claim 1, Heinzmann teaches an image processing apparatus for estimating a
motion of a predetermined feature point of a 3D object from a motion picture of the 3D 
object taken by a monocular camera, comprising (Limitations present only within the 
preamble are not given patentable weight): 

observation vector extracting means for extracting projected coordinates 
of the predetermined feature point onto an image plane, from each of frames of the 
motion picture (Heinzmann: page 142, col 2, para 2: "forwarded to the 2-D model... 
image plane... 2-D image positions of the features"; Fig. 1); 

3D model initializing means for making the observation vector extracting 
means extract from an initial frame of the motion picture, initial projected coordinates in 
a model coordinate arithmetic expression for calculation of model coordinates of the 
predetermined feature point on the basis of a first parameter, a second parameter, and 
the initial projected coordinates (Heinzmann: Fig. 1; abstract: "3-D model... initialize 
the feature tracking"; parameters: abstract: "feature positions... gaze direction...
head rotation"; Fig. 1: "feature positions... relative positions". Fig. 1 shows that
the projected coordinates are extracted from the 2-D model into the 3-D model;
page 142, col 2, para 2: "2-D image positions of the features are transferred to a
3-D model of the feature locations"; page 144, col 1, para 5 - col 2, para 1: "affine
transformation... a good approximation of perspective projection provided the 
depth of the object does not exceed 1/10 of the distance between camera and 
object. This is usually the case in face tracking applications.": Therefore, a 
parameter is the depth of object which is not expected to exceed 1/10 of the 
distance between the camera and the object; page 144, col 2, paras 2-3: "angle"; 
page 144, col 2, paras 4-5: "theta... orientations"; Fig. 3: "camera coordinates- 
angles"; Fig. 2: "angles"; page 145, col 1, para 3 - col 2, para 1: "distance and 
orientation"; page 145, col 2, para 2: "depth". Fig. 4: shows 9 parameters, page 
146, col 1, para 3 - col 2, para 1: "Figure 4 shows the output of some tracking 
parameter including the rotational angles, the displacement, the gaze direction of 
both eyes and the uncertainty of the face tracking"); and 

motion estimating means for calculating estimates of state variables 
including a third parameter in a motion arithmetic expression for calculation of 
coordinates of the predetermined feature point at a time of photography when a 
processing target frame of the motion picture different from the initial frame was taken, 
from the model coordinates, the first parameter, and the second parameter, and for 
outputting an output value about the motion of the predetermined feature point on the 
basis of the second parameter included in the estimates of the state variables 
(Heinzmann: page 142, col 2, para 2: "The estimated positions of the features 
determine the location within the next image frame of the hardware search 
windows." Note that the state variables include the parameters that were listed
above: page 142, col 2, para 2: "3-D triplets"; Fig. 1: "3-D pose": output), 

wherein the model coordinate arithmetic expression is based on back 
projection of the monocular camera, the first parameter is a parameter independent of a 
local motion of a portion including the predetermined feature point, and the second 
parameter is a parameter dependent on the local motion of the portion including the 
predetermined feature point (Heinzmann: page 142, col 2, para 2: "The 3-D model is 
also projected back into the image plane to adapt the constraints in the 2-D 
model."; abstract: "monocular"; page 144, col 1, para 5 - col 2, para 1: "affine
transformation... a good approximation of perspective projection provided the 
depth of the object does not exceed 1/10 of the distance between camera and 
object. This is usually the case in face tracking applications.": Therefore, a 
parameter is the depth of object which is not expected to exceed 1/10 of the 
distance between the camera and the object. This parameter is independent of 
local motion. Note that parameters are also listed above, page 145, col 2, para 2: 
"monocular"), and 

wherein the motion estimating means: 
calculates predicted values of the state variables at the time of photography when the 
processing target frame was taken, based on a state transition model (Heinzmann: 
page 143, col 2, para 6: "probabilistic relocation of features based on template 
correlations and a simple 2-D facial model"); 

applies the initial projected coordinates, and the first parameter and the
second parameter included in the predicted values of the state variables, to the model
coordinate arithmetic expression to calculate estimates of the model coordinates at the
time of photography (Heinzmann: Figs. 1 and 3: Note that every frame corresponds
to a time of photography);

applies the third parameter in the predicted values of the state variables 
and the estimates of the model coordinates to the motion arithmetic expression to 
calculate estimates of coordinates of the predetermined feature point at the time of 
photography (Heinzmann: page 146, col 1, paras 1-2: the third parameter can be
interpreted to be the confidence; Figs. 1 and 3. See arguments made above for
parameters.);

applies the estimates of the coordinates of the predetermined feature point 
to an observation function based on an observation model of the monocular camera to 
calculate estimates of an observation vector of the predetermined feature point 
(Heinzmann: page 146, col 1, para 1-2. Figs. 1 and 3); 

makes the observation vector extracting means extract the projected 
coordinates of the predetermined feature point from the processing target frame, as the 
observation vector (Heinzmann: page 145, col 1, para 3: "gaze vector"; Figs. 1 and 
3; page 146, col 1, para 2: "gaze vector"); and 

filters the predicted values of the state variables by use of the extracted 
observation vector and the estimates of the observation vector to calculate estimates of 
the state variables at the time of photography (Heinzmann: Fig. 1: "Kalman filtering". 
Note that every frame corresponds to a time of photography. As stated above, the
state variables include the parameters. A coordinate is an observation vector 
originating from the origin in the corresponding coordinate space; page 145, col 
1, para 3: "gaze vector"; Figs. 1 and 3; page 146, col 1, para 2: "gaze vector... 
Intersecting G, with a world model yields the gaze point"). 
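
For illustration only, the predict, observe, and filter cycle recited in the final
limitations of claim 1 can be sketched with a scalar Kalman filter. The sketch is
not taken from Heinzmann or from the application. The function name kalman_step,
the identity state transition and observation models, and the noise values q and r
are assumptions chosen to keep the example small.

```python
def kalman_step(x_est, p_est, z_obs, q=0.01, r=1.0):
    """One predict/filter cycle for a single scalar coordinate (hypothetical helper).

    x_est, p_est: previous state estimate and its variance.
    z_obs: coordinate extracted from the processing target frame.
    q, r: assumed process and observation noise variances.
    """
    # Predict the state at the time of the target frame.
    # Assumption: identity state transition model (the feature stays put).
    x_pred = x_est
    p_pred = p_est + q

    # Estimate the observation from the prediction.
    # Assumption: identity observation model.
    z_pred = x_pred

    # Filter: weigh the prediction against the extracted observation.
    gain = p_pred / (p_pred + r)
    x_new = x_pred + gain * (z_obs - z_pred)
    p_new = (1.0 - gain) * p_pred
    return x_new, p_new
```

With equal prediction and observation variances and no process noise, the filtered
estimate falls halfway between the prediction and the observation.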

As per claim 2, Heinzmann teaches the image processing apparatus according to 
Claim 1, wherein the first parameter is a static parameter to converge at a specific
value, and wherein the second parameter is a dynamic parameter to vary with the 
motion of the portion including the predetermined feature point (Heinzmann: See 
arguments made for rejection claim 1: The static parameter can be interpreted to 
be the length (or depth) of the gaze vector that converges to a specific gaze point 
(page 146, col 1, para 2). The second dynamic value is the angle or orientation that
varies over time along with the motion). 

As per claim 3, Heinzmann teaches the image processing apparatus according to 
Claim 2, wherein the static parameter is a depth from the image plane to the 
predetermined feature point (Heinzmann: See arguments made for rejection claim 1, 
2: The depth of the feature from the image plane is considered as a parameter.). 

As per claim 4, Heinzmann teaches the image processing apparatus according 
to Claim 2, wherein the dynamic parameter is a rotation parameter for specifying a 
rotation motion of the portion including the predetermined feature point (Heinzmann:
See arguments made for rejection claim 1, 2: The rotation is considered as a 
parameter). 

As per claim 5, Heinzmann teaches the image processing apparatus according to 
Claim 4, wherein the rotation parameter is angles made by a vector from an origin to the 
predetermined feature point, relative to two coordinate axes in a coordinate system 
whose origin is at a center of the portion including the predetermined feature point 
(Heinzmann: See arguments made for rejection claim 1: page 146, col 1: "eye 
orientation... alpha_x, alpha_y... origin is located between the eyes"). 

As per claim 6, Heinzmann teaches the image processing apparatus according to 
Claim 1, wherein the first parameter is a rigid parameter, and wherein the second
parameter is a non-rigid parameter (Heinzmann: See arguments made for rejection 
claim 1, 2: The depth is the rigid parameter, and the angle/orientation is the
non-rigid parameter. Also, affine and perspective transformations are non-rigid
transformations, but the depth would not be affected by the transformations).

As per claim 7, Heinzmann teaches the image processing apparatus according to 
Claim 6, wherein the rigid parameter is a depth from the image plane to the model 
coordinates (Heinzmann: See arguments made for rejection claim 1, 6.). 

As per claim 8, Heinzmann teaches the image processing apparatus according
to Claim 6, wherein the non-rigid parameter is a change amount about a position 
change of the predetermined feature point due to the motion of the portion including the 
predetermined feature point (Heinzmann: See arguments made for rejection claim 1, 
5).

As per claim 9, Heinzmann teaches the image processing apparatus according 
to Claim 1, wherein the motion model is based on rotation and translation
motions of the 3D object, and wherein the third parameter is a translation parameter for 
specifying a translation amount of the 3D object and a rotation parameter for specifying 
a rotation amount of the 3D object (Heinzmann: See arguments made for rejection 
claim 1, 2, and 5: Fig. 4: "DispX... DispY": translation; Fig. 1: "template tracking":
template tracking or matching accounts for in-plane translations; Fig. 3: "camera
coordinates... angles"; Fig. 1).
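
As a minimal sketch of what a motion model based on rotation and translation means
in practice, the example below rotates a model-coordinate point about one axis and
then translates it. It is an assumption-laden illustration, not the expression used
in the application or in Heinzmann: the function name, the choice of the z axis,
and the single rotation angle are all hypothetical.

```python
import math


def rotate_z_then_translate(point, angle_rad, translation):
    """Apply a rotation about the z axis, then a translation (hypothetical helper).

    point, translation: (x, y, z) tuples; angle_rad: rotation amount in radians.
    """
    x, y, z = point
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    # Rotation amount: in-plane rotation about the z axis.
    rx = c * x - s * y
    ry = s * x + c * y
    # Translation amount: added after the rotation.
    tx, ty, tz = translation
    return (rx + tx, ry + ty, z + tz)
```

For example, rotating the point (1, 0, 0) by a quarter turn and translating it by
(0, 0, 5) moves it to approximately (0, 1, 5).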

As per claim 23, Heinzmann teaches the image processing apparatus according 
to Claim 1, wherein a 3D structure of a center of a pupil on a facial picture is
defined by a static parameter and a dynamic parameter, and wherein the a gaze is 
determined by estimating the static parameter and the dynamic parameter 
(Heinzmann: See arguments made for rejection claim 1, 2, 5, 9). 

As per claim 24, Heinzmann teaches the image processing apparatus according
to Claim 23, wherein the static parameter is a depth of the pupil in a camera coordinate 
system (Heinzmann: See arguments made for rejection claim 1, 2, 5, 9). 

As per claim 25, Heinzmann teaches the image processing apparatus according 
to Claim 23, wherein the dynamic parameter is a rotation parameter of an eyeball 
(Heinzmann: See arguments made for rejection claim 1, 2, 5, 9). 

As per claim 26, Heinzmann teaches the image processing apparatus according 
to Claim 25, wherein the rotation parameter of the eyeball has two degrees of freedom 
to permit rotations with respect to two coordinate axes in an eyeball coordinate system 
(Heinzmann: See arguments made for rejection claim 1, 2, 5, 9: alpha_x, alpha_y). 

As per claim 27, Heinzmann teaches the image processing apparatus according 
to Claim 1, wherein a 3D structure of the 3D object on the a picture is defined by a
rigid parameter and a non-rigid parameter and wherein the motion of the 3D object is 
determined by estimating the rigid parameter and the non-rigid parameter (Heinzmann: 
See arguments made for rejection claim 1, 2, 5, 6, 9). 

As per claim 28, Heinzmann teaches the image processing apparatus according 
to Claim 27, wherein the rigid parameter is a depth of a feature point of the 3D object in 
a model coordinate system (Heinzmann: See arguments made for rejection claim 1,
2, 5, 6, 9). 

As per claim 29, Heinzmann teaches the image processing apparatus according 
to Claim 27, wherein the non-rigid parameter is a change amount of a feature point of 
the 3D object in a model coordinate system (Heinzmann: See arguments made for 
rejection claim 1, 2, 5, 6, 9). 

Claim Rejections - 35 USC § 103 

The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

Claim 10 is rejected under 35 U.S.C. 103(a) as being unpatentable over J. 
Heinzmann and A. Zelinsky, "3-D facial pose and gaze point estimation using a robust 
real-time tracking paradigm," IEEE Int. Workshop on Automatic Face and Gesture 
Recognition, pp. 142-147, 1998 (Heinzmann) as applied to claim 1 above, and further in
view of Park, K. R., et al., "Gaze position detection by computing the three dimensional 
facial positions and motions," Pattern Recognition, Vol. 35, No. 11, Nov. 2002, pp. 
2559-2569 (Park). 

As per claim 10, Heinzmann teaches the image processing apparatus according
to Claim 1, wherein the motion estimating means applies Kalman filtering as 
said filtering (Heinzmann: See arguments made for rejecting claim 1). Heinzmann 
does not teach extended Kalman filtering. 

Park teaches extended Kalman filtering (Park: page 2564, col 1, para 4: "extended 
Kalman"). 

Thus, it would have been obvious for one of ordinary skill in the art at the time the 
invention was made to implement the teachings of Park into Heinzmann since 
Heinzmann suggests a system for determining face and gaze positions using Kalman 
filtering in general and Park suggests the beneficial use of a system for determining 
face and gaze positions using extended Kalman filtering in the analogous art of
image processing. It would have been obvious for one of ordinary skill in the art at the 
time the invention was made to implement the teachings of Park into Heinzmann since it 
is well known that the extended Kalman filter is applicable to nonlinear problems 
whereas the Kalman filter is not. Therefore, one can apply the extended Kalman filter in 
order to obtain a more robust system. Furthermore, one of ordinary skill in the art at the 
time the invention was made could have combined the elements as claimed by known 
methods and, in combination, each component functions the same as it does 
separately. One of ordinary skill in the art at the time the invention was made would 
have recognized that the results of the combination would be predictable. 
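
To make the distinction between the two filters concrete, the sketch below shows an
extended Kalman update for a nonlinear observation function: the filter linearizes
the function around the current estimate and uses that derivative in place of a
fixed observation matrix. It is an illustration under stated assumptions and is not
drawn from Park or Heinzmann. It assumes a pinhole projection u = focal * x_model / depth,
a known model coordinate x_model, a static depth (no process noise), and
hypothetical names and noise values.

```python
def ekf_depth_update(z_est, p_est, u_obs, x_model, focal=500.0, r=1.0):
    """One extended Kalman update of a scalar depth estimate (hypothetical helper)."""
    # Nonlinear observation function: projected coordinate under a pinhole model.
    h = focal * x_model / z_est
    # Derivative of h with respect to depth, evaluated at the current estimate.
    jac = -focal * x_model / (z_est ** 2)
    # Innovation variance and gain computed from the linearized model.
    s = jac * p_est * jac + r
    gain = p_est * jac / s
    z_new = z_est + gain * (u_obs - h)
    p_new = (1.0 - gain * jac) * p_est
    return z_new, p_new


def estimate_depth(u_obs, x_model, z0=1.5, p0=0.25, steps=50):
    """Repeat the update with the same noise-free observation (assumed setup)."""
    z, p = z0, p0
    for _ in range(steps):
        z, p = ekf_depth_update(z, p, u_obs, x_model)
    return z
```

With a true depth of 2.0 and x_model of 0.1, the observation is 25.0 and the
estimate moves from the initial guess of 1.5 toward 2.0 over repeated updates.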

Conclusion

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Atiba Fitzpatrick, whose telephone number is
(571) 270-5255. The examiner can normally be reached Monday through Friday,
10:00 am to 6:00 pm.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's
supervisor, Samir Ahmed, can be reached at (571) 272-7413. The fax phone number for
the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of an application may be obtained from the Patent 
Application Information Retrieval (PAIR) system. Status information for published 
applications may be obtained from either Private PAIR or Public PAIR. Status 
information for unpublished applications is available through Private PAIR only. For 
more information about the PAIR system, see http://pair-direct.uspto.gov. Should you 
have questions on access to the Private PAIR system, contact the Electronic Business 
Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO 
Customer Service Representative or access to the automated information system, call 
800-786-9199 (IN USA OR CANADA) or 571-272-1000. 
Atiba Fitzpatrick 

Examiner, Art Unit 2624 
Supervisory Patent Examiner, Art Unit 2624



